Methods and products for facile microbial expression of DNA sequences.

Microbial expression of an exogenous polypeptide may
be inefficient if the messenger RNA has secondary structure
(due to complementarity of spaced regions of the molecule)
which impedes interaction with a ribosome. The occurrence of
such structure can be predicted. The degeneracy of the genetic
code gives freedom to alter the nucleic acid composition of
the cDNA to eliminate harmful secondary structure of the
mRNA.

For example, bovine growth hormone (BGH) is poorly expressed
by E. coli transformed with a recombinant plasmid
containing the natural BGH gene. The corresponding mRNA
has complementary regions at 46-51 and 73-78. Therefore a
plasmid pBGH 33-4 was constructed containing a BGH gene
whose upstream portion was synthesized (making use of plasmid
pHGH 207-1, which contains an efficiently expressible gene
for the closely related human growth hormone) so as not to
cause the troublesome complementarity. E. coli transformed
therewith produced BGH with a yield of better than 10^5 copies
per cell.





0075444 


METHODS AND PRODUCTS FOR FACILE MICROBIAL
EXPRESSION OF DNA SEQUENCES

The present invention provides methods and means for preparing DNA
sequences that provide messenger RNA having improved translation
characteristics. The resulting messenger RNA may be highly efficient
in translation to give substantial amounts of polypeptide product that
is normally heterologous to the host microorganism. The DNA sequences
which are ultimately expressed, that is, transcribed into messenger
RNA (mRNA) which is in turn translated into polypeptide product, are,
in essential part, synthetically prepared, in accordance with this
invention, utilizing means that favor the substantial reduction or
elimination of secondary and/or tertiary structure in the
corresponding transcribed mRNA. An absence or substantial reduction
in such secondary/tertiary structure involving the 5' end of mRNA
permits effective recognition and binding of ribosome(s) to the mRNA
for subsequent translation. Thus, the efficiency of translation is
not hindered or impaired by conformational impediments in the
structure of the transcribed mRNA. Methods and means for measuring
mRNA secondary/tertiary structure are also described, as well as
associated means designed to ensure that secondary/tertiary structure
is kept below certain preferred limits. This invention is exemplified
by the preparation of various preferred protein products.




With the advent of recombinant DNA technology, the controlled
microbial production of an enormous variety of useful polypeptides
has become possible, putting within reach the microbially directed
manufacture of hormones, enzymes, antibodies, and vaccines useful
against a wide variety of diseases. Many mammalian polypeptides,
such as human growth hormone and leukocyte interferons, have already
been produced by various microorganisms.

One basic element of recombinant DNA technology is the plasmid, an
extrachromosomal loop of double-stranded DNA found in bacteria,
oftentimes in multiple copies per cell. Included in the information
encoded in the plasmid DNA is that required to reproduce the plasmid
in daughter cells (i.e., a "replicon") and, ordinarily, one or more
selection characteristics, such as resistance to antibiotics, which
permit clones of the host cell containing the plasmid of interest to
be recognized and preferentially grown in selective media. The
utility of such bacterial plasmids lies in the fact that they can be
specifically cleaved by one or another restriction endonuclease or
"restriction enzyme", each of which recognizes a different site on
the plasmidic DNA. Heterologous genes or gene fragments may be
inserted into the plasmid by endwise joining at the cleavage site or
at reconstructed ends adjacent to the cleavage site. (As used herein,
the term "heterologous" refers to a gene not ordinarily found in, or
a polypeptide sequence ordinarily not produced by, a given
microorganism, whereas the term "homologous" refers to a gene or
polypeptide which is found in, or produced by, the corresponding
wild-type microorganism.) Thus formed are so-called replicable
expression vehicles.


DNA recombination is performed outside the microorganism, and the
resulting "recombinant" plasmid can be introduced into microorganisms
by a process known as transformation, and large quantities of the
heterologous gene-containing recombinant plasmid are obtained by
growing the transformant. Moreover, where the gene is properly
inserted with reference to portions of the plasmid which govern the
transcription and translation of the encoding DNA, the resulting
plasmid can be used to actually produce the polypeptide sequence for
which the inserted gene codes, a process referred to as expression.
Plasmids which express a (heterologous) gene are referred to as
replicable expression vehicles.


Expression is initiated in a DNA region known as the promoter. In
some cases, as in the lac and trp systems discussed infra, promoter
regions are overlapped by "operator" regions to form a combined
promoter-operator. Operators are DNA sequences which are recognized
by so-called repressor proteins which serve to regulate the frequency
of transcription initiation from a particular promoter. In the
transcription phase of expression, RNA polymerase recognizes certain
sequences in and binds to the promoter DNA. The binding interaction
causes an unwinding of the DNA in this region, exposing the DNA as a
template for synthesis of messenger RNA. The messenger RNA serves as
a template for ribosomes which bind to the messenger RNA and
translate the mRNA into a polypeptide chain having the amino acid
sequence for which the RNA/DNA codes. Each amino acid is encoded by a
nucleotide triplet or "codon"; the codons collectively make up the
"structural gene", i.e., that part of the DNA sequence which encodes
the amino acid sequence of the expressed polypeptide product.

After binding to the promoter, RNA polymerase initiates the
transcription of DNA encoding a ribosome binding site including a
translation initiation or "start" signal (ordinarily ATG, which in
the resulting messenger RNA becomes AUG), followed by DNA sequences
encoding the structural gene itself. So-called translational stop
codons are transcribed at the end of the structural gene, whereafter
the polymerase may form an additional sequence of messenger RNA
which, because of the presence of the translational stop signal,
will remain untranslated by the ribosomes.
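
The relationship between the start signal, the structural gene and
the stop codon can be illustrated by a minimal Python sketch. It is
offered by way of illustration only: the function names are
hypothetical, the codon table is the standard genetic code, and the
example sequence is an assumption made up of a hypothetical ATG
start, eight codons reading val-gln-cys-arg-ser-val-glu-gly (as in
the final row of Table I herein), a TAG stop and four arbitrary
trailing bases.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(sense_dna: str) -> str:
    """Return the mRNA corresponding to a DNA sense strand (T becomes U)."""
    return sense_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to, but excluding, the first stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":  # translational stop signal
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical example: start codon + eight codons + stop + untranslated tail.
EXAMPLE_DNA = "ATG" + "GTGCAGTGCCGCTCTGTGGAGGGC" + "TAG" + "CTGA"
print(translate(transcribe(EXAMPLE_DNA)))
```

The bases following the stop codon are transcribed but do not appear
in the translated product, in keeping with the description above.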

Ribosomes bind to the binding site provided on the messenger RNA, in
bacteria ordinarily as the mRNA is being formed, and subsequently
direct the production of the encoded polypeptide, beginning at the
translation start signal and ending at the previously mentioned stop
signal(s). The resulting product may be obtained by lysing the host
cell and recovering the product by appropriate purification from
other bacterial proteins.

Polypeptides expressed through the use of recombinant DNA technology
may be entirely heterologous, functional proteins, as in the case of
the direct expression of human growth hormone, or alternatively may
comprise a bioactive heterologous polypeptide portion and, fused
thereto, a portion of the amino acid sequence of a homologous
polypeptide, as in the case of the production of intermediates for
somatostatin and the components of human insulin. In the latter
cases, for example, the fused homologous polypeptide comprised a
portion of the amino acid sequence for beta galactosidase. In those
cases, the intended bioactive product is rendered bioinactive within
the fused, homologous/heterologous polypeptide until it is cleaved
in an extracellular environment. Fusion proteins like those just
mentioned can be designed so as to permit highly specific cleavage
of the precursor protein from the intended product, as by the action
of cyanogen bromide on methionine, or alternatively by enzymatic
cleavage. See, e.g., G.B. Patent Publication No. 2 007 676 A.



If recombinant DNA technology is to fully sustain its promise,
systems must be devised which optimize expression of gene inserts,
so that the intended polypeptide products can be made available in
controlled environments and in high yields.

Promoter Systems

As examples, the beta lactamase and lactose promoter systems have
been advantageously used to initiate and sustain microbial
production of heterologous polypeptides. Details relating to the
make-up and construction of these promoter systems have been
published by Chang et al., Nature 275, 617 (1978) and Itakura
et al., Science 198, 1056 (1977), which are hereby incorporated by
reference. More recently, a system based upon tryptophan, the
so-called trp promoter system, has been developed. Details relating
to the make-up and construction of this system have been published
by Goeddel et al., Nucleic Acids Research 8, 4057 (1980) and Kleid
et al., U.S.S.N. 133,296, filed March 24, 1980 (or the equivalent
European Patent Publication 0036776), which are hereby incorporated
by reference. Numerous other microbial promoters have been
discovered and utilized, and details concerning their nucleotide
sequences, enabling a skilled worker to ligate them functionally
within plasmid vectors, have been published; see, e.g., Siebenlist
et al., Cell 20, 269 (1980), which is incorporated herein by this
reference.


Historically, recombinant cloning vehicles (extrachromosomal duplex
DNA having, inter alia, a functional origin of replication) have
been prepared and used to transform microorganisms; cf. Ullrich
et al., Science 196, 1313 (1977). Later, there were attempts to
actually express DNA gene inserts encoding a heterologous
polypeptide. Itakura et al. (Science 198, 1056 (1977)) expressed
the gene encoding somatostatin in E. coli. Other like successes
followed, the gene inserts being constructed by organic synthesis
using newly refined technology. In order, among other things, to
avoid possible proteolytic degradation of the polypeptide product
within the microbe, the genes were ligated to DNA sequences coding
for a precursor polypeptide. Extracellular cleavage yielded the
intended protein product, as discussed above.

In the case of larger proteins, chemical synthesis of the underlying
DNA sequence proved unwieldy. Accordingly, resort was had to the
preparation of gene sequences by reverse transcription from
corresponding messenger RNA obtained from requisite tissues and/or
culture cells. These methods did not always prove satisfactory,
owing to the termination of transcription short of the entire
sequence, and/or the desired sequence would be accompanied by
naturally occurring precursor leader or signal DNA. Thus, these
attempts often have resulted in incomplete protein product and/or
protein product in non-cleavable conjugate form; cf. Villa-Komaroff
et al., Proc. Natl. Acad. Sci. (USA) 75, 3727 (1978) and Seeburg
et al., Nature 276, 793 (1978).

In order to avoid these difficulties, Goeddel et al., Nature 281,
544 (1979), constructed DNA, inter alia encoding human growth
hormone, using chemically synthesized DNA in conjunction with
enzymatically synthesized DNA. This discovery thus made available
the means enabling the microbial expression of hybrid DNA
(combination of chemically synthesized DNA with enzymatically
synthesized DNA), notably coding for proteins of limited
availability which would probably otherwise not have been produced
economically. The hybrid DNA (encoding heterologous polypeptide) is
provided in substantial portion, preferably a majority, via reverse
transcription of mRNA, while the remainder is provided via chemical
synthesis. In a preferred embodiment, synthetic DNA encoding the
first 24 amino acids of human growth hormone (HGH) was constructed
according to a plan which incorporated an endonuclease restriction
site in the DNA corresponding to HGH amino acids 23 and 24. This
was done to facilitate a connection with downstream HGH cDNA
sequences. The various 12 oligonucleotide long fragments making up
the synthetic part of the DNA were chosen following then-known
criteria for gene synthesis: avoidance of undue complementarity of
the fragments, one with another, except, of course, those destined
to occupy opposing sections of the double stranded sequence;
avoidance of AT rich regions to minimize transcription termination;
and choice of "microbially preferred codons." Following synthesis,
the fragments were permitted to effect complementary hydrogen
bonding and were ligated according to methods known per se. This
work is described in published British Patent Specification
2055382 A, which corresponds to Goeddel et al., U.S.S.N. 55126,
filed July 5, 1979, which is hereby incorporated by this reference.

While the successful preparation and expression of such hybrid DNA
provided a useful means for preparing heterologous polypeptides, it
did not address the general problem that eucaryotic genes are not
always recognized by procaryotic expression machinery in a way which
provides copious amounts of end product. Evolution has incorporated
sophistication unique to discrete organisms. Bearing in mind that
the eucaryotic gene insert is heterologous to the procaryotic
organism, the relative inefficiency in expression often observed can
be true for any gene insert, whether it is produced chemically, from
cDNA, or as a hybrid. Thus, the criteria used to construct the
synthetic part of the gene for HGH, defined above, are not the sole
factors influencing expression levels. For example, concentrating on
codon choice as the previous workers have done (cf. British Patent
Specification 2007676 A) has not been completely successful in
raising the efficiency of expression towards maximal expression
levels.

Guarente et al., Science 209, 1428 (1980), experimented with several
hybrid ribosome binding sites, designed to match the number of base
pairs between the Shine-Dalgarno sequence and the ATG of some known
E. coli binding sites, their work suggesting that the reason(s) for
observed relatively low efficiencies of eucaryotic gene expression
by procaryote organisms is more subtle.




That the initiation of mRNA translation may be a multicomponent
process is illustrated by work reported by Iserentant and Fiers,
Gene 1 (1980). They postulate that secondary structure of mRNA is
one of the components influencing translation efficiency and imply
that the initiation codon and ribosome interaction site of secondary
structured, folded mRNA must be "accessible." However, what those
workers apparently mean by "accessible" is that the codon and site
referred to be located on the loop, rather than the stem, of the
secondary structure models they have hypothesized.


The present invention is based upon the discovery that the presence
of secondary/tertiary conformational structure in the mRNA
interferes with the initiation and maintenance of ribosomal binding
during the translation phase of heterologous gene expression.



The present invention, relating to these findings, uniquely provides
methods and means for providing efficient expression of heterologous
gene inserts by the requisite microbial host. The present invention
is further directed to a method of microbially producing
heterologous polypeptides, utilizing specifically tailored
heterologous gene inserts in microbial expression vehicles, as well
as associated means. It is particularly directed to the use of
synthetically derived gene insert portions that are prepared so as
to both encode the desired polypeptide product and provide mRNA that
has minimal secondary/tertiary structure and hence is accessible for
efficient ribosomal translation.


In preferred embodiments of the present invention, synthetic DNA is
provided for a substantial portion of the initial coding sequence of
a heterologous gene insert, and optionally, upstream therefrom
through the ATG translational start codon and ribosome binding site.
The critical portion of DNA is chemically synthesized, keeping in
mind two factors: 1) the creation of a sequence that codes for the
initial (N-terminal) amino acid sequence of a polypeptide comprising
a functional protein or bioactive portion thereof and 2) the
assurance that said sequence provides, on transcription, messenger
RNA that has a secondary/tertiary conformational structure which is
insufficient to interfere with its accessibility for efficient
ribosomal translation, as herein defined. Such chemical synthesis
may use standard organic synthesis using modified mononucleotides as
building blocks, such as according to the method of Crea et al.,
Nucleic Acids Research 8, 2331 (1980), and/or the use of site
directed mutagenesis of DNA fragments, such as according to the
method of Razin et al., Proc. Natl. Acad. Sci. (USA) 75, 4268
(1978), and/or synthetic primers on certain appropriately sequenced
DNA fragments followed by specific cleavage of the desired region.

The present invention is directed to a process of preparing DNA
sequences comprising nucleotides arranged sequentially so as to
encode the proper amino acid sequence of a given polypeptide. This
method may involve obtaining a substantial portion of the DNA coding
sequence of a given polypeptide via means other than chemical
synthesis, most often by reverse transcription from requisite tissue
and/or culture cell messenger RNA. This fragment encodes the
C-terminal portion of the polypeptide and is ligated, in accordance
herewith, to a remainder of the coding sequence, e.g. obtained by
chemical synthesis, optionally including properly positioned
translational start and stop signals and upstream DNA through the
ribosome binding site and the first nucleotide (+1) of the resultant
messenger RNA. The synthetic fragment is designed by nucleotide
choice dependent on conformation of the corresponding messenger RNA
according to the criteria as herein discussed.

The thus prepared DNA sequences are suited for insertion and use in
replicable expression vehicles designed to direct the production of
the heterologous polypeptide in a transformant microorganism. In
these vehicles, the DNA sequence is operably linked to promoter
systems which control its expression. The invention is further
directed to the replicable expression vehicles and the transformant
microorganisms so produced, as well as to cultures of these
microorganisms in fermentation media. This invention is further
directed to associated methods and means and to specific embodiments
for the directed production of messenger RNA transcripts that are
accessible for efficient ribosomal translation.

Specifically excluded from the present invention is the hybrid DNA
encoding human growth hormone (HGH) as disclosed by Goeddel et al.,
Nature 281, 544 (1979). While this particular hybrid DNA was
successfully expressed to produce the intended product, the concept
of the present invention was not appreciated by these workers (and
hence not taught by them) and consequently was not practised in the
fortuitous preparation of their expressible hybrid DNA for HGH. This
hybrid DNA has the following sequence (Table I):

Table I

phe asp asn ala met
TTC GAT AAC GCT ATG

phe asp thr tyr gln
TTT GAC ACC TAC CAG

40
gln lys tyr ser phe
CAG AAG TAT TCA TTC

ser glu ser ile pro
TCA GAG TCT ATT CCG

lys ser asn leu glu
AAA TCC AAC CTA GAG

ser trp leu glu pro
TCG TGG CTG GAG CCC

100
ser leu val tyr gly
AGC CTA GTG TAC GGC

lys asp leu glu glu
AAG GAC CTA GAG GAA

asp gly ser pro arg
GAT GGC AGC CCC CGG

lys phe asp thr asn
AAG TTC GAC ACA AAC

160
tyr gly leu leu tyr
TAC GGG CTG CTG TAC

thr phe leu arg ile
ACA TTC CTG CGC ATC

180                                         191
val gln cys arg ser val glu gly ser cys gly phe stop
GTG CAG TGC CGC TCT GTG GAG GGC AGC TGT GGC TTC TAG




The chemically synthetic DNA sequences hereof extend preferably from
the ATG translation initiation site, and optionally upstream
therefrom a given distance, to or beyond the transcription
initiation site (labelled +1 by convention), and to sequences
downstream encoding a substantial part of the desired polypeptide.
By way of preference, the synthetic DNA comprises upwards of
approximately 75 or more nucleotide pairs of the structural gene
representing about the proximal (N-terminal) 25 amino acids of the
intended polypeptide. In particularly preferred embodiments, the
synthetic DNA sequence extends from about the translation initiation
site (ATG) to about nucleotide 75 of the heterologous gene. In
alternative terms, the synthetic DNA sequence comprises nucleotide
pairs from +1 (transcription initiation) to about nucleotide 100 of
the transcript.


Because of the degeneracy of the genetic code, there is substantial
freedom in codon choice for any given amino acid sequence. Given
this freedom, the number of different DNA nucleotide sequences
encoding any given amino acid sequence is exceedingly large, for
example, upwards of 2.6 x 10^5 possibilities for somatostatin,
consisting of only 14 amino acids. Again, the present invention
provides methods and means for selecting certain of those DNA
sequences, those which will efficiently prepare functional product.
For a given polypeptide product hereof, the present invention
provides means to select, from among the large number possible,
those DNA sequences that provide transcripts, the conformational
structure of which admits of accessibility for operable and
efficient ribosomal translation.
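
The scale of this freedom can be checked with a short Python sketch
that multiplies, residue by residue, the number of codons available
under the standard genetic code. The sketch is illustrative only:
the function name is hypothetical, and the one-letter somatostatin
sequence used (AGCKNFFWKTFTSC) is an assumption supplied for the
example rather than a sequence set out in this specification.

```python
from collections import Counter
from math import prod

# One-letter products of the 64 codons in conventional TCAG order;
# counting the letters gives the number of codons per amino acid.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO)


def count_encodings(protein: str) -> int:
    """Number of distinct DNA sequences encoding the given protein."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in protein.upper())


# Assumed 14-residue somatostatin sequence (not taken from the text).
SOMATOSTATIN = "AGCKNFFWKTFTSC"
print(count_encodings(SOMATOSTATIN))
```

Counting all six serine codons, the sketch yields a figure of a few
hundred thousand candidate sequences for this short peptide, which
is in keeping with the order of magnitude discussed above.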

Conformational structure of mRNA transcripts is a consequence of
hydrogen bonding between complementary nucleotide sequences that may
be separated one from another by a sequence of noncomplementary
nucleotides. Such bonding is commonly referred to as secondary
structure. So-called tertiary structures may add to the conformation
of the overall molecule. These structures are believed to be a
result of spatial interactions within one or more portions of the
molecule, so-called stacking interactions. In any event, the
conformational structure of a given mRNA molecule can be determined
and measured. Furthermore, we have now discovered that certain
levels of conformational structure of mRNA transcripts interfere
with efficient protein synthesis, thus effectively blocking the
initiation and/or continuation of translation (elongation) into
polypeptide product. Accordingly, levels at which such
conformational structure does not occur, or at least is minimal,
can be predicted. Nucleotide choice can be prescribed on the basis
of the predictable, permissible levels of conformational structure,
and preferred gene sequences determined accordingly.

The measurement of mRNA conformational structure is determined, in
accordance herewith, by measuring the energy levels associated with
the conformational structure of the mRNA molecule.

In determining such energy levels, the thermodynamic disassociation
energy connected with one or a series of homologous base pairings is
calculated, for example according to the rules of Tinoco et al.,
Nature New Biol. 246, 40 (1973). In this calculation, AT base
pairing is assigned an associated energy level of about -1.2
kcal/mole while a CG base pairing is assigned an associated energy
level of about -2 kcal/mole. Adjacent homologous pairings are more
than additive, doubtless due to stacking interactions and other
associative factors. In any event, it has been determined that in
those instances where regional base pairing interactions result in
energy levels stronger than about -12 kcal/mole (that is, values
expressed arithmetically in numbers less than about -12 kcal/mole)
for a given homologous sequence, such interactions are likely
sufficient to hinder or block the translation phase of expression,
most probably by interfering with accessibility for necessary
ribosomal binding.
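
As a rough illustration of the arithmetic, the following Python
sketch sums the per-pair values quoted above for a candidate stem
and compares the total with the -12 kcal/mole level. It is a
simplification under stated assumptions: the function names are
hypothetical, only the purely additive contribution is computed
(the more-than-additive stacking terms of the Tinoco rules are not
modelled, so the result understates the stability of a real stem),
and U is treated as T so that either DNA or RNA letters may be used.

```python
# Approximate per-pair energies (kcal/mole) as quoted in the text.
PAIR_ENERGY = {
    frozenset("AT"): -1.2,
    frozenset("CG"): -2.0,
}


def additive_stem_energy(upstream: str, downstream: str) -> float:
    """Sum pair energies for two equal-length segments read 5' to 3'.

    Base i of the upstream segment is paired with base i counted from
    the 3' end of the downstream segment; non-complementary positions
    contribute nothing.
    """
    up = upstream.upper().replace("U", "T")
    down = downstream.upper().replace("U", "T")
    if len(up) != len(down):
        raise ValueError("segments must have equal length")
    return sum(
        PAIR_ENERGY.get(frozenset((a, b)), 0.0)
        for a, b in zip(up, reversed(down))
    )


def likely_hinders(energy: float, threshold: float = -12.0) -> bool:
    """True when the energy is stronger (more negative) than the threshold."""
    return energy < threshold


print(additive_stem_energy("GGGCCC", "GGGCCC"))  # six CG pairs
print(likely_hinders(-15.2))
```

On the additive values alone, six CG pairs only just reach -12
kcal/mole; a six base pair stem exceeds that level only once the
more-than-additive contribution of adjacent pairings is included.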

A given DNA sequence is screened as follows: A first series of base
pairs, e.g., approximately the first six base pairs, are compared
for homology with the corresponding reverse last base pairs of the
gene. If such homology is found, the associated energy levels are
calculated according to the above considerations. The first series
of base pairs is next compared with the corresponding last base
pairs up to the penultimate base pair of the gene and the
associated energy levels of any homology calculated. In succession
the first series of base pairs is next compared with the
corresponding number of base pairs up to the antepenultimate base
pair, and so on until the entire gene sequence is compared, back to
front. Next, the series of base pairs beginning one downstream from
the first series, e.g. base pairs 2 to 7 of the prior example, is
compared with the corresponding number from the end and
progressively toward the front of the gene, as described above.
This procedure is repeated until each base pair is compared for
homology with all other regions of the gene and associated energy
levels are determined.
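
The screening procedure just described can be sketched in Python as
a pair of nested loops. The sketch makes several assumptions that
are not part of the text: the function names are hypothetical,
"homology" is taken to mean exact reverse complementarity over the
whole window, overlapping windows are skipped, and positions are
reported 1-based. The test sequence is invented for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sense-strand segment."""
    return dna.translate(COMPLEMENT)[::-1]


def scan_for_stems(seq: str, window: int = 6) -> list[tuple[int, int]]:
    """List (upstream_start, downstream_start) of fully complementary windows.

    Each window, taken from the front, is compared with windows taken
    from the end of the sequence and stepping progressively toward the
    front, as in the procedure described above.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * window + 1):
        target = reverse_complement(seq[i:i + window])
        for j in range(len(seq) - window, i + window - 1, -1):
            if seq[j:j + window] == target:
                hits.append((i + 1, j + 1))
    return hits


# Invented sequence: positions 1-6 can pair with positions 11-16.
print(scan_for_stems("GGGCCATTTTTGGCCC"))
```

Each reported pair of windows would then be passed to an energy
calculation of the kind described earlier to decide whether the
interaction is strong enough to matter.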

Thus, for example, in Figure 3 there are provided results of such
scanning and calculating for two genes, those encoding natural
bovine growth hormone (BGH) and synthetic (i.e., hybrid) BGH. It can
be seen that natural BGH contains two regions of homology considered
relevant herein (i.e., energy level stronger than about -12
kcal/mole), to wit, six base pairs from base pair 33 to 38 with
homologous pairs 96 to 101 and six base pairs from 46 to 51 with 73
to 78. The first is not significant for present purposes, despite
the energy level (-15.40 kcal/mole), presumably because the region
of homology lies downstream a sufficient distance so as not to be
influential on translation efficiency. The second region is
significant, as evidenced by the poor yields of product as described
herein, cf. infra. The synthetic BGH gene, where said region of
homology was eliminated, provided good yields of intended protein.

An embodiment of the present invention will now be described by way
of example with reference to the accompanying drawings, in which:

Figure 1 depicts the amino acid and nucleotide sequences of the
proximal portions of natural BGH, synthetic HGH, and synthetic BGH.
The amino acids and nucleotides in natural BGH that are different
from those in synthetic HGH are underlined. The nucleotides in the
proximal portion of the synthetic BGH gene that differ from those in
the natural BGH gene also are underlined. The position of the PvuII
restriction site at the end of the proximal portion of these genes
is indicated.

In arriving at the synthetic BGH gene encoding the proper amino acid
sequence for BGH, the nucleotide sequences of natural BGH and
synthetic HGH were compared. Nucleotide selections were made based
upon the synthetic HGH gene for construction of the synthetic BGH
gene, taking into account also the latitude permitted by the
degeneracy of the genetic code, using a minimum of nucleotide
changes from the synthetic HGH sequence.
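
The two constraints named in this paragraph, namely preserving the
encoded amino acids while keeping the number of nucleotide changes
small, can be checked mechanically. The following Python sketch is
illustrative only: the function names are hypothetical, the standard
genetic code is assumed, and the two short sequences are invented
for the example (they encode leu-glu-glu and are not the actual BGH
or HGH designs).

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding segment, codon by codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def compare_designs(original: str, redesigned: str) -> tuple[bool, int]:
    """Return (same amino acid sequence?, number of nucleotide changes)."""
    if len(original) != len(redesigned):
        raise ValueError("sequences must have equal length")
    changes = sum(a != b for a, b in zip(original.upper(), redesigned.upper()))
    return translate(original) == translate(redesigned), changes


# Invented example: three synonymous third-position substitutions.
print(compare_designs("CTAGAGGAA", "CTGGAAGAG"))
```

A candidate redesign would be acceptable under this check when the
first element is true, and among acceptable candidates the one with
the fewest changes would be preferred.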

Figure 2 depicts the nucleotide sequences of the sense strands of
both natural and synthetic BGH genes along with the transcribed
portions of the respective preceding trp-promoter sequences. The
first nucleotide of each transcript is indicated as "+1" and the
following nucleotides are numbered sequentially. The sequences are
lined up to match the translated coding regions of both genes,
beginning at the start codon "ATG" of each (overlined). The
transcript of the natural BGH gene shows an area of "secondary
structure" due to interactions of nucleotides 46 to 51 with
nucleotides 73 to 78, respectively (see Figure 3), thus creating the
stem-loop structure depicted. This area is not present in the
synthetic BGH gene, removed by virtue of nucleotide changes (see
Figure 1), which nevertheless retains the correct amino acid
sequence.

Figure 3 shows the locations and stabilities of secondary structures
in the transcripts of natural and synthetic BGH (see Figure 2).
These locations and stabilities were determined using a nucleotide
by nucleotide analysis, as described herein. Each area of
significant secondary structure of each proximal portion of gene is
listed in the respective table. Thus, for natural BGH versus
synthetic BGH, it is noted that the energy levels of "secondary
structure" at corresponding portions of the translatable transcripts
(namely, nucleotides 46 to 78 comprising a 6 nucleotide long stem in
natural BGH versus nucleotides 52 to 84 of synthetic BGH) are
markedly different (-15.2 kcal/mole versus greater than -10
kcal/mole), accounting for the observed success of expression of the
synthetic BGH gene versus the natural BGH gene, cf. infra. The
energy levels indicate the significance of the relative amounts of
tolerable "secondary structure", i.e., values arithmetically greater
than about -12 kcal/mole based upon thermodynamic energy
considerations. The significance of location of "secondary
structure" can be appreciated by the fact that energy levels
calculated for positions 33 to 101 versus 38 to 104 of natural
versus synthetic BGH, respectively, did not significantly influence
expression levels.

Figure 4 depicts the construction of pBGH 33 used as shown in
Figure 5.

Figure 5 depicts the construction of plasmids harboring DNA
sequences for hybrid polypeptides: pBHGH 33-1 used as shown in
Figure 7, pBHGH being a hybrid of bovine and human growth hormone
sequences, and pHBGH a hybrid of human and bovine sequences.

Figure 6 depicts the technique used to assemble the synthetic
proximal portion of the BGH gene, pBR 322-01, used in the
construction shown in Figure 7.

Figure 7 depicts the construction of the plasmid (pBGH 33-3)
harboring the gene for BGH comprising the synthetic proximal
portion as shown in Figure 6.

Figure 8 depicts the construction of expression plasmid pBGH 33-4
harboring the hybrid BGH gene.

Figure 9 is the result of a polyacrylamide gel segregation of cell
protein. Part A shows no BGH production at any cell density using
the culture containing natural BGH gene. Part B shows the
expression of synthetic BGH gene (lanes BGH #1 and #2) in the same
medium as used for Part A. The levels of expression indicated in
Part B, as opposed to Part A, reflect the production of BGH in
amounts exceeding about 100 thousand copies per cell.


In its most preferred embodiment, the invention is illustrated by
the microbial production of bovine growth hormone (BGH). BGH is
endogenous in bovine, e.g., cattle, and is responsible for proper
physical maturation of the animal. It is also useful for
increasing weight gain, feed conversion efficiency, lean to fat
ratio, and milk production. Its sequence of 190 amino acids is
known. See Dayhoff, Atlas of Protein Sequence and Structure 1972,
National Biomedical Research Foundation, Washington, D.C. The
present invention makes possible the preparation of commercial
quantities of the compound, enabling now its application on a
large scale in the animal husbandry industry. An initial approach
toward preparing BGH microbially took advantage of a source of
bovine pituitary glands. By extraction and purification, the
requisite mRNA for BGH was isolated and from it, corresponding
cDNA prepared. Thus, this initial work resulted in a gene
corresponding, for all intents and purposes, to the natural DNA
sequence of BGH. After removal of DNA coding for the presequence
and adding a start codon, the cDNA was ligated to a plasmid vector
under proper control of a promoter. This plasmid was used to
transform E. coli host which was grown under usual conditions.

The efficiency of expression of BGH product was poor, a
consequence, it was discovered, of conformational structure of the
messenger RNA, which greatly reduced its accessibility for ribosomal
translation, cf. Figure 3.

For example, it was found that in "natural" BGH mRNA there are
regions of complementary homology. One significant region centers
around positions +46 to +51 with a homologous region at positions
+73 to +78. Secondary structure considerations, in these two
defined regions, are thought to create a hairpin arrangement just
downstream from the translation start codon ATG and the ribosome
binding site. This conformational arrangement interferes with or
prematurely disrupts ribosomal binding, and hence, inhibits
translation.
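
A stem of this kind can be looked for by scanning a transcript for a short stretch whose reverse complement recurs further downstream. The Python sketch below is a simplified illustration, not the method used to compute the energy levels cited in this specification: it assumes exact Watson-Crick pairing only (no G-U pairs, no mismatches, no energy model), and the toy sequence, function names and parameters are hypothetical.

```python
# Watson-Crick partners in RNA (assumption: G-U wobble pairs are ignored).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_stems(rna, stem_len=6, min_loop=3):
    """Return 0-based (i, j) pairs where rna[i:i+stem_len] can pair with
    rna[j:j+stem_len], with at least min_loop unpaired bases in between."""
    hits = []
    for i in range(len(rna) - stem_len + 1):
        target = reverse_complement(rna[i:i + stem_len])
        j = rna.find(target, i + stem_len + min_loop)
        while j != -1:
            hits.append((i, j))
            j = rna.find(target, j + 1)
    return hits


# Hypothetical toy transcript: a 6-base stem separated by an 8-base loop.
toy = "AA" + "GCGCAU" + "UUUUUUUU" + reverse_complement("GCGCAU") + "AA"
stems = find_stems(toy)
print(stems)
```

Applied to a real transcript, the two members of each reported pair would correspond to regions such as +46 to +51 and +73 to +78; ranking candidate stems by stability would require a thermodynamic model that this sketch does not include.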
Recognition of this phenomenon prompted investigations into
the nature of the DNA sequence in these regions and the discovery
of methods and means to obviate the problem. In accordance
herewith, advantage was taken of a Pvu II endonuclease restriction
site at the BGH DNA corresponding to amino acid 24. DNA for the
first 24 amino acids of BGH was chemically synthesized, the
selection of nucleotides taking into strict account proper coding
sequence and resultant mRNA secondary/tertiary structure
considerations. Employing the method defined above, it was found
that certain nucleotide base selections would be suitable, on the
basis of predicted conformational structure energy levels, to
prepare gene sequences properly encoding BGH but devoid of
problematic conformational structure. One of these was selected
and synthesized. Ligations at the Pvu II terminus of the
synthetic piece to the cDNA downstream therefrom produced the
desired hybrid gene. Construction of a replicable expression
vector containing said heterologous gene as an operable insert
successfully resulted in efficient expression of BGH in
The complete nucleotide (and deduced amino acid) sequence of the
thus constructed hybrid BGH gene is as follows:


  1  met phe pro ala met ser leu ser gly leu phe ala asn ala val
     ATG TTC CCA GCT ATG TCT CTA TCT GGT CTA TTC GCT AAC GCT GTT

 16  leu arg ala gln his leu his gln leu ala ala asp thr phe lys
     CTT CGT GCT CAG CAT CTT CAT CAG CTG GCT GCT GAC ACC TTC AAA

 31  glu phe glu arg thr tyr ile pro glu gly gln arg tyr ser ile
     GAG TTT GAG CGC ACC TAC ATC CCG GAG GGA CAG AGA TAC TCC ATC

 46  gln asn thr gln val ala phe cys phe ser glu thr ile pro ala
     CAG AAC ACC CAG GTT GCC TTC TGC TTC TCT GAA ACC ATC CCG GCC

 61  pro thr gly lys asp glu ala gln gln lys ser asp leu glu leu
     CCC ACG GGC AAG GAT GAG GCC CAG CAG AAA TCA GAC TTG GAG CTG

 76  leu arg ile ser leu leu leu ile gln ser trp leu gly pro leu
     CTT CGC ATC TCA CTG CTC CTC ATC CAG TCG TGG CTT GGG CCC CTG

 91  gln phe leu ser arg val phe thr asn ser leu val phe gly thr
     CAG TTC CTC AGC AGA GTC TTC ACC AAC AGC TTG GTG TTT GGC ACC

106  ser asp arg val tyr glu lys leu lys asp leu glu glu gly ile
     TCG GAC CGT GTC TAT GAG AAG CTG AAG GAC CTG GAG GAA GGC ATC

121  leu ala leu met arg glu leu glu asp gly thr pro arg ala gly
     CTG GCC CTG ATG CGG GAG CTG GAA GAT GGC ACC CCC CGG GCT GGG

136  gln ile leu lys gln thr tyr asp lys phe asp thr asn met arg
     CAG ATC CTC AAG CAG ACC TAT GAC AAA TTT GAC ACA AAC ATG CGC

151  ser asp asp ala leu leu lys asn tyr gly leu leu ser cys phe
     AGT GAC GAC GCG CTG CTC AAG AAC TAC GGT CTG CTC TCC TGC TTC

166  arg lys asp leu his lys thr glu thr tyr leu arg val met lys
     CGG AAG GAC CTG CAT AAG ACG GAG ACG TAC CTG AGG GTC ATG AAG

181  cys arg arg phe gly glu ala ser cys ala phe Stop
     TGC CGC CGC TTC GGG GAG GCC AGC TGC GCA TTC TAG
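
The reading of the proximal portion of this sequence can be checked mechanically. The Python sketch below translates the first 30 codons with the standard genetic code and locates the Pvu II recognition sequence CAGCTG, which the text places at amino acid 24. It is an illustration only: the variable names are hypothetical, and codon 9 is entered as GGT (gly), an assumed reading of the printed sequence chosen to agree with the amino acid shown at that position.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# First 30 codons of the hybrid BGH gene as listed above
# (assumption: codon 9 is read as GGT).
front_codons = (
    "ATG TTC CCA GCT ATG TCT CTA TCT GGT CTA TTC GCT AAC GCT GTT "
    "CTT CGT GCT CAG CAT CTT CAT CAG CTG GCT GCT GAC ACC TTC AAA"
).split()

protein = "".join(CODON_TABLE[codon] for codon in front_codons)
dna = "".join(front_codons)

# Locate the first Pvu II site and report which codons it spans.
site_offset = dna.find("CAGCTG")
first_codon = site_offset // 3 + 1

print(protein)
print(f"Pvu II site at nucleotide offset {site_offset}, "
      f"codons {first_codon}-{first_codon + 1}")
```

The same lookup table can be applied to the remaining rows to confirm that each codon agrees with the amino acid printed above it.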
Detailed Description

Synthesis of Proximal Portion of BGH Gene

Twelve fragments, U 1-6 (upper strand) and L 1-6 (lower
strand), were synthesized. Also synthesized, in order to
repair the 3' end of the gene, were 2 fragments, BGH Repair (1)
(upper strand) and BGH Repair (2) (lower strand).

The 14 fragments were synthesized according to the method of
Crea et al., Nucleic Acids Research, 8, 2331 (1980). The
syntheses of the fragments were accomplished from the
appropriate solid support (cellulose) by sequential addition of
the appropriate fully protected dimer- or trimer-blocks. The
cycles were carried out under the same conditions as described
in the synthesis of oligothymidilic acid (see Crea et al.,
supra). The final polymer was treated with base (aq. conc.
NH3) and acid (80% aq. HOAc), the polymer pelleted off and
the supernatant evaporated to dryness. The residue, as
dissolved in 4% aq. NH3, was washed with ether (3x) and used
for the isolation of the fully deprotected fragment.
Purification was accomplished by hplc on Rsil
µ-particulate column. Gel electrophoretic analysis showed that
each of the fragments, U,L 1-6 and BGH Repair (1) and (2), had
the correct size:
Fragment          Sequence (5' to 3')                 Size

U 1               AAT.TCT.ATG.TTC.C                   13-mer
U 2               CAG.CTA.TGT.CTC.T                   13-mer
U 3               ATC.TGG.TCT.ATT.C                   13-mer
U 4               GCT.AAC.GCT.GTT.C                   13-mer
U 5               TTC.GTG.CTC.AGC.A                   13-mer
U 6               TCT.TCA.TCA.GCT.GA                  14-mer
L 1               ATA.GCT.GGG.AAC.ATA.G               16-mer
L 2               ACC.AGA.TAG.AGA.C                   13-mer
L 3               CGT.TAG.CGA.ATA.G                   13-mer
L 4               GCA.CGA.AGA.ACA.G                   13-mer
L 5               ATG.AAG.ATG.CTG.A                   13-mer
L 6               AGC.TTC.AGC.TG                      11-mer
BGH Repair (1)    AA.TTC.AGC.TGC.GCA.TTC.TAG.A        21-mer
BGH Repair (2)    AG.CTT.CTA.GAA.TGC.GCA.GCT.G        21-mer
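
The pairing of the twelve U and L fragments can be checked with a short Python sketch. It joins the upper fragments in order, joins the lower fragments in the order L 6 to L 1 (the 5' to 3' direction of the lower strand), and confirms that the two strands are complementary apart from a four-base protruding end on each side. The fragment strings are entered as read from the table above, with G/C readings chosen so that the strands pair; that choice, and all names in the code, are assumptions of this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Fragment sequences, 5' to 3', as read from the table (assumed readings).
UPPER = [
    "AATTCTATGTTCC",    # U 1
    "CAGCTATGTCTCT",    # U 2
    "ATCTGGTCTATTC",    # U 3
    "GCTAACGCTGTTC",    # U 4
    "TTCGTGCTCAGCA",    # U 5
    "TCTTCATCAGCTGA",   # U 6
]
LOWER = [
    "ATAGCTGGGAACATAG", # L 1
    "ACCAGATAGAGAC",    # L 2
    "CGTTAGCGAATAG",    # L 3
    "GCACGAAGAACAG",    # L 4
    "ATGAAGATGCTGA",    # L 5
    "AGCTTCAGCTG",      # L 6
]

upper_strand = "".join(UPPER)
lower_strand = "".join(reversed(LOWER))  # L 6 ... L 1 reads 5' to 3'

# The upper strand starts with an unpaired AATT (EcoRI end); the lower
# strand starts with an unpaired AGCT (HindIII end).
paired_region = upper_strand[4:]
strands_pair = revcomp(lower_strand) == paired_region + "AGCT"

print(len(upper_strand), len(lower_strand), len(paired_region), strands_pair)
```

Because each lower fragment overlaps the junction between two upper fragments, the staggered design holds neighbouring fragments in register for ligation.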





Construction of pBGH 33

Fresh frozen bovine pituitaries were macerated and RNA was
extracted by the guanidinium thiocyanate method (Harding et al.,
J. Biol. Chem. 252 (20), 7391 (1977) and Ullrich et al., Science
196, 1313 (1977)). The total RNA extract was then passed over
an oligo-dT cellulose column to purify poly A containing
messenger RNA (mRNA). Using reverse transcriptase and oligo-dT
as a primer, single stranded cDNA was made from the mRNA.
Second strand synthesis was achieved by use of the Klenow
fragment of DNA polymerase I. Following S1 enzyme treatment and
acrylamide gel electrophoresis a size cut of the total cDNA
(ca. 500-1500 bp) was eluted and cloned into the Pst I site of
the amp gene of pBR 322 using traditional tailing and
annealing conditions.

The pBR 322 plasmids containing cDNA were transformed into
E. coli K-12 strain 294 (ATCC No. 31446). Colonies containing
recombinant plasmids were selected by their resistance to
tetracycline and sensitivity to ampicillin. Approximately 2000
of these clones were screened for BGH by colony hybridization.

The cDNA clones of HGH contain an internal 550 bp HaeIII
fragment. The amino acid sequence of this region is very
similar to the BGH amino acid sequence. This HGH HaeIII
fragment was radioactively labeled and used as a probe to find
the corresponding BGH sequence amongst the 2000 clones.

Eight positive clones were identified. One of these, pBGH112,
was verified by sequence analysis as BGH. This full-length
clone is 940 bp long containing the coding region of the 26
amino acid presequence as well as the 191 amino acid protein
sequence.
In order to achieve direct BGH expression, a synthetic
"expression primer" was made having the sequence
5'-ATGTTCCCAGCCATG-3'. The nucleotides in the fourth through
fifteenth position are identical to the codons of the first 4
amino acids of the mature BGH protein, as determined by
sequence data of pBGH 112. Only the 5' ATG (methionine) is
alien to this region of the protein. This was necessary in
order to eliminate the presequence region of our BGH clone and
to provide the proper initiation codon for protein synthesis.
By a series of enzymatic reactions this synthetic primer was
elongated on the BGH 112 cDNA insert. The primed product was
cleaved with Pst I to give a DNA fragment of 270 bp containing
coding information up to amino acid 90 (Figure 4). This
"expression" BGH cDNA fragment was ligated into a pBR 322
vector which contained the trp promoter. This vector was
derived from pLeIF A trp25 (Goeddel et al., Nature 287, 411
(1980)). The interferon cDNA was removed and the trp25-322
vector purified by gel electrophoresis. The recombinant
plasmid (pBGH710) now contained the coding information for
amino acids 1-90 of the mature BGH protein, linked directly to
the trp promoter. This linkage was verified by DNA sequence
analysis. The second half of the coding region and the 3'
untranslated region was isolated by Pst I restriction digest
and acrylamide gel electrophoresis. This "back-end" fragment
of 540 bp was then ligated into pBGH710 at the site of amino
acid 90. Recombinant plasmids were checked by restriction
analysis and DNA sequencing. The recombinant plasmid, pBGH33,
has the trp promoter directly linked via ATG with the complete
DNA coding sequence for mature BGH.
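
The sizes quoted in this construction are mutually consistent, as a short worked check in Python shows. The primer is split into its codons, and the length of coding sequence from the added ATG through amino acid 90 is computed at three nucleotides per codon. The variable names are hypothetical and the check is arithmetic only; it does not model the exact position of the Pst I cut within its recognition sequence.

```python
# Expression primer as given in the text, 5' to 3'.
PRIMER = "ATGTTCCCAGCCATG"

# Split into codons: the first is the added start codon, the remaining
# four correspond to the first amino acids of the mature protein.
codons = [PRIMER[i:i + 3] for i in range(0, len(PRIMER), 3)]
start_codon, mature_codons = codons[0], codons[1:]

# Coding information "up to amino acid 90", counting the start codon as 1.
LAST_AMINO_ACID = 90
coding_length_bp = LAST_AMINO_ACID * 3

print(start_codon, mature_codons)
print(f"{LAST_AMINO_ACID} codons x 3 = {coding_length_bp} bp")
```

The result agrees with the 270 bp fragment obtained after Pst I cleavage of the primed product.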
Construction of pHGH 207-1

Plasmid pGM1 carries the E. coli tryptophan operon containing
the deletion ΔLE1413 (G.F. Miozzari, et al., (1978)
J. Bacteriology 1457-1466) and hence expresses a fusion protein
comprising the first 6 amino acids of the trp leader and
approximately the last third of the trp E polypeptide
(hereinafter referred to in conjunction as LE'), as well as the
trp D polypeptide in its entirety, all under the control of the
trp promoter-operator system. The plasmid, 20 µg, was digested
with the restriction enzyme PvuII which cleaves the plasmid at
five sites. The gene fragments were next combined with EcoRI
linkers (consisting of a self complementary oligonucleotide of
the sequence: pCATGAATTCATG) providing an EcoRI cleavage site
for a later cloning into a plasmid containing an EcoRI site.
The 20 µg of DNA fragments obtained from pGM1 were treated with
10 units T4 DNA ligase in the presence of 200 pico moles of
the 5'-phosphorylated synthetic oligonucleotide pCATGAATTCATG
and in 20 µl T4 DNA ligase buffer (20 mM tris, pH 7.6, 0.5 mM
ATP, 10 mM MgCl2, 5 mM dithiothreitol) at 4°C overnight. The
solution was then heated 10 minutes at 70°C to inactivate
ligase. The linkers were cleaved by EcoRI digestion and the
fragments, now with EcoRI ends, were separated using
polyacrylamide gel electrophoresis (hereinafter "PAGE") and the
three largest fragments isolated from the gel by first staining
with ethidium bromide, locating the fragments with ultraviolet
light, and cutting from the gel the portions of interest. Each
gel fragment, with 300 microliters 0.1xTBE, was placed in a
dialysis bag and subjected to electrophoresis at 100 V for one
hour in 0.1xTBE buffer (TBE buffer contains: 10.8 gm tris
base, 5.5 gm boric acid, 0.09 gm Na2EDTA in 1 liter H2O).
The aqueous solution was collected from the dialysis bag,
phenol extracted, chloroform extracted and made 0.2 M sodium
chloride, and the DNA recovered in water after ethanol
precipitation. (All DNA fragment isolations hereinafter
described are performed using PAGE followed by the
electroelution method just discussed.) The trp
promoter-operator-containing gene with EcoRI sticky ends was
identified in the procedure next described, which entails the
insertion of fragments into a tetracycline sensitive plasmid
which, upon promoter-operator insertion, becomes tetracycline
resistant.

Plasmid pBRH1 (R.I. Rodriguez, et al., Nucleic Acids Research
6, 3267-3287 [1979]) expresses ampicillin resistance and
contains the gene for tetracycline resistance but, there being
no associated promoter, does not express that resistance. The
plasmid is accordingly tetracycline sensitive. By introducing
a promoter-operator system in the EcoRI site, the plasmid can
be made tetracycline resistant.

pBRH1 was digested with EcoRI and the enzyme removed by
phenol/CHCl3 extraction followed by chloroform extraction and
recovered in water after ethanol precipitation. The resulting
DNA molecule was, in separate reaction mixtures, combined with
each of the three DNA fragments obtained as described above and
ligated with DNA ligase as previously described. The DNA
present in the reaction mixture was used to transform competent
E. coli K-12 strain 294 (K. Backman et al., Proc Nat'l Acad Sci
USA 73, 4174-4190 (1976)) (ATCC no. 31446) by standard
techniques (V. Hershfield et al., Proc Nat'l Acad Sci USA 71,
3455-3459 (1974)) and the bacteria plated on LB plates
containing 20 µg/ml ampicillin and 5 µg/ml tetracycline.
Several tetracycline-resistant colonies were selected, plasmid
DNA isolated and the presence of the desired fragment confirmed
by restriction enzyme analysis. The resulting plasmid,
designated pBRHtrp, expresses β-lactamase, imparting ampicillin
resistance, and it contains a DNA fragment including the trp
promoter-operator and encoding a first protein comprising a
fusion of the first six amino acids of the trp leader and
approximately the last third of the trp E polypeptide (this
polypeptide is designated LE'), and a second protein
corresponding to approximately the first half of the trp D
polypeptide (this polypeptide is designated D'), and a third
protein coded for by the tetracycline resistance gene.

pBRH trp was digested with EcoRI restriction enzyme and the
resulting fragment 1 isolated by PAGE and electroelution.
EcoRI-digested plasmid pSom 11 (K. Itakura et al., Science 198,
1056 (1977); G.B. patent publication no. 2 007 676 A) was
combined with this fragment 1. The mixture was ligated with
DNA ligase as previously described and the resulting DNA
transformed into E. coli K-12 strain 294 as previously
described. Transformant bacteria were selected on
ampicillin-containing plates. Resulting ampicillin-resistant
colonies were screened by colony hybridization (M. Gruenstein
et al., Proc Nat'l Acad Sci USA 72, 3951-3965 [1975]) using as
a probe the trp promoter-operator-containing fragment 1
isolated from pBRHtrp, which had been radioactively labelled
with P32. Several colonies shown positive by colony
hybridization were selected, plasmid DNA was isolated and the
orientation of the inserted fragments determined by restriction
analysis employing restriction enzymes BglII and BamHI in
double digestion. E. coli 294 containing the plasmid
designated pSom7Δ2, which has the trp promoter-operator
fragment in the desired orientation, was grown in LB medium
containing 10 µg/ml ampicillin. The cells were grown to
optical density 1 (at 550 nm), collected by centrifugation and
resuspended in M9 media in tenfold dilution. Cells were grown
for 2-3 hours, again to optical density 1, then lysed and total
cellular protein analyzed by SDS (sodium dodecyl sulfate) urea
(15 percent) polyacrylamide gel electrophoresis (J.V. Maizel
Jr. et al., Meth Virol 5, 180-246 (1971)).

The plasmid pSom7Δ2, 10 µg, was cleaved with EcoRI and the DNA
fragment 1 containing the tryptophan genetic elements was
isolated by PAGE and electroelution. This fragment, 2 µg, was
digested with the restriction endonuclease Taq I, 2 units, 10
minutes at 37°C such that, on the average, only one of the
approximately five Taq I sites in each molecule is cleaved.
This partially digested mixture of fragments was separated by
PAGE and an approximately 300 base pair fragment 2 that
contained one EcoRI end and one Taq I end was isolated by
electroelution. The corresponding Taq I site is located
between the transcription start and translation start sites and
is 5 nucleotides upstream from the ATG codon of the trp leader
peptide. The DNA sequence about this site is shown in Figure
4. By proceeding as described, a fragment could be isolated
containing all control elements of the trp operon, i.e.,
promoter-operator system, transcription initiation signal, and
part of the trp leader ribosome binding site.
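
The logic of the partial digestion can be sketched in Python. When a linear EcoRI fragment is cut at only one of its Taq I sites, each product carries one EcoRI end and one Taq I end, and the product of interest is picked out by size. In the sketch below the fragment length and the site positions are hypothetical placeholders (only the count of five sites and the target of about 300 base pairs come from the text), as are the function names.

```python
def single_cut_products(length_bp, site_positions):
    """List the two pieces obtained by cutting a linear fragment once at
    each site; every piece has one original end and one new end."""
    products = []
    for site in sorted(site_positions):
        products.append({"site": site, "piece": "left", "size": site})
        products.append({"site": site, "piece": "right", "size": length_bp - site})
    return products


def near_size(products, target_bp, tolerance_bp):
    """Keep the pieces whose size lies within tolerance of the target."""
    return [p for p in products if abs(p["size"] - target_bp) <= tolerance_bp]


# Hypothetical values for illustration only.
FRAGMENT_LENGTH_BP = 1000
TAQ_SITES_BP = [100, 300, 450, 700, 900]  # five sites, positions invented

candidates = near_size(
    single_cut_products(FRAGMENT_LENGTH_BP, TAQ_SITES_BP),
    target_bp=300,
    tolerance_bp=20,
)
for piece in candidates:
    print(piece)
```

Where more than one single-cut product falls in the size window, as in this invented example, further restriction analysis would be needed to tell them apart.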

The Taq I residue at the 3' end of the resulting fragment
adjacent the translation start signal for the trp leader
sequence was next converted into an XbaI site. This was done
by ligating the Fragment 2 obtained above to a plasmid
containing a unique (i.e., only one) EcoRI site and a unique
XbaI site. For this purpose, one may employ essentially any
plasmid containing, in order, a replicon, a selectable marker
such as antibiotic resistance, and EcoRI, XbaI and BamHI
sites. Thus, for example, an XbaI site can be introduced
between the EcoRI and BamHI sites of pBR322 (F. Bolivar et al.,
Gene 2, 95-119 [1977]) by, e.g., cleaving at the plasmid's
unique Hind III site with Hind III followed by single
strand-specific nuclease digestion of the resulting sticky
ends, and blunt end ligation of a self annealing
double-stranded synthetic nucleotide containing the recognition
site such as CCTCTAGAGG. Alternatively, naturally derived DNA
fragments may be employed, as was done in the present case,
that contain a single XbaI site between EcoRI and BamHI
cleavage residues. Thus, an EcoRI and BamHI digestion product
of the viral genome of hepatitis B was obtained by conventional
means and cloned into the EcoRI and BamHI sites of plasmid pGH6
(D.V. Goeddel et al., Nature 281, 544 [1979]) to form the
plasmid pHS32. Plasmid pHS32 was cleaved with XbaI, phenol
extracted, chloroform extracted and ethanol precipitated. It
was then treated with 1 µl E. coli polymerase I, Klenow
fragment (Boehringer-Mannheim) in 30 µl polymerase buffer (50
mM potassium phosphate pH 7.4, 7 mM MgCl2, 1 mM
β-mercaptoethanol) containing 0.1 mM dTTP and 0.1 mM dCTP for 30
minutes at 0°C then 2 hr. at 37°C. This treatment causes 2 of
the 4 nucleotides complementary to the 5' protruding end of the
XbaI cleavage site to be filled in:

     5' CTAGA-             5' CTAGA-
                   -->
     3'     T-             3'   TCT-

Two nucleotides, dC and dT, were incorporated giving an end
with two 5' protruding nucleotides. This linear residue of
plasmid pHS32 (after phenol and chloroform extraction and
recovery in water after ethanol precipitation) was cleaved with
EcoRI. The large plasmid fragment was separated from the
smaller EcoRI-XbaI fragment by PAGE and isolated after
electroelution. This DNA fragment from pHS32 (0.2 µg), was
ligated, under conditions similar to those described above, to
the EcoRI-Taq I fragment of the tryptophan operon (0.01 µg).
In this process the Taq I protruding end is ligated to the XbaI
remaining protruding end even though it is not completely
Watson-Crick base-paired:

     -T       CTAGA-            -TCTAGA-
          +              -->
     -AGC       TCT-            -AGCTCT-

A portion of this ligation reaction mixture was transformed
into E. coli 294 cells as in part I. above, heat treated and
plated on LB plates containing ampicillin. Twenty-four
colonies were selected, grown in 3 ml LB media, and plasmid
isolated. Six of these were found to have the XbaI site
regenerated via E. coli catalyzed DNA repair and replication:

     -TCTAGA-            -TCTAGA-
                  -->
     -AGCTCT-            -AGATCT-

These plasmids were also found to cleave both with EcoRI and
HpaI and to give the expected restriction fragments. One
plasmid, designated pTrp 14, was used for expression of
heterologous polypeptides, as next discussed.
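
The partial fill-in of the XbaI end described above depends only on which nucleotides are supplied to the polymerase. The Python sketch below models this as a minimal illustration: the recessed 3' end is extended opposite the 5' protruding bases, one at a time, until a required nucleotide is unavailable. The function name and representation are hypothetical, and no enzyme kinetics or exonuclease activity is modelled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def fill_in(overhang_5_to_3, available_dntps):
    """Return the bases added to the recessed 3' end, in order of addition.

    The polymerase copies the protruding strand starting from the base
    nearest the duplex, i.e. it reads the overhang from its 3' side.
    """
    added = []
    for template_base in reversed(overhang_5_to_3):
        needed = COMPLEMENT[template_base]
        if needed not in available_dntps:
            break  # synthesis stops at the first missing nucleotide
        added.append(needed)
    return "".join(added)


# XbaI leaves a 5' protruding CTAG; only dTTP and dCTP are supplied.
xba_partial = fill_in("CTAG", {"T", "C"})
remaining_overhang = len("CTAG") - len(xba_partial)

# With all four nucleotides the end becomes blunt.
xba_full = fill_in("CTAG", {"A", "C", "G", "T"})

# EcoRI leaves a 5' protruding AATT; dTTP and dATP suffice to blunt it.
eco_full = fill_in("AATT", {"A", "T"})

print(xba_partial, remaining_overhang, xba_full, eco_full)
```

The first case reproduces the incorporation of dC and dT that leaves two 5' protruding nucleotides; the other two cases correspond to complete fill-in, which yields blunt ends.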

The plasmid pHGH 107 (D.V. Goeddel et al., Nature, 281, 544,
1979) contains a gene for human growth hormone made up of 23
amino acid codons produced from synthetic DNA fragments and 163
amino acid codons obtained from complementary DNA produced via
reverse transcription of human growth hormone messenger RNA.
This gene, 3, though it lacks the codons of the "pre" sequence
of human growth hormone, does contain an ATG translation
initiation codon. The gene was isolated from 10 µg pHGH 107
after treatment with EcoRI followed by E. coli polymerase I
Klenow fragment and dTTP and dATP as described above.
Following phenol and chloroform extraction and ethanol
precipitation the plasmid was treated with BamHI.

The human growth hormone ("HGH") gene-containing fragment 3 was
isolated by PAGE followed by electroelution. The resulting DNA
fragment also contains the first 350 nucleotides of the
tetracycline resistance structural gene, but lacks the
tetracycline promoter-operator system so that, when subsequently
cloned into an expression plasmid, plasmids containing the
insert can be located by the restoration of tetracycline
resistance. Because the EcoRI end of the fragment 3 has been
filled in by the Klenow polymerase I procedure, the fragment
has one blunt and one sticky end, ensuring proper orientation
when later inserted into an expression plasmid.

The expression plasmid pTrp14 was next prepared to receive the
HGH gene-containing fragment prepared above. Thus, pTrp14 was
XbaI digested and the resulting sticky ends filled in with the
Klenow polymerase I procedure employing dATP, dTTP, dGTP and
dCTP. After phenol and chloroform extraction and ethanol
precipitation the resulting DNA was treated with BamHI and the
resulting large plasmid fragment isolated by PAGE and
electroelution. The pTrp14-derived fragment had one blunt and
one sticky end, permitting recombination in proper orientation
with the HGH gene containing fragment 3 previously described.
The HGH gene fragment 3 and the pTrp14 Xba-BamHI fragment were
combined and ligated together under conditions similar to those
described above. The filled in XbaI and EcoRI ends ligated
together by blunt end ligation to recreate both the XbaI and
the EcoRI site:

     XbaI filled in     EcoRI filled in            HGH gene initiation

     -TCTAG              AATTCTATG-            -TCTAGAATTCTATG-
                    +                   -->
     -AGATC              TTAAGATAC-            -AGATCTTAAGATAC-
                                                XbaI  EcoRI

This construction also recreates the tetracycline resistance
gene. Since the plasmid pHGH 107 expresses tetracycline
resistance from a promoter lying upstream from the HGH gene (the
lac promoter), this construction, designated pHGH 207, permits
expression of the gene for tetracycline resistance under the
control of the tryptophan promoter-operator. Thus the ligation
mixture was transformed into E. coli 294 and colonies selected on
LB plates containing 5 µg/ml tetracycline.
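
That the blunt end ligation recreates both recognition sites can be confirmed by joining the two filled-in ends as strings. The following Python sketch is a minimal check using the upper strand sequences of the two ends described above; the variable names are hypothetical.

```python
# Upper strands, 5' to 3', of the two filled-in ends.
xba_filled_end = "TCTAG"          # vector side, after complete fill-in
eco_filled_end = "AATTCTATG"      # gene side, ending in the ATG start codon

XBAI_SITE = "TCTAGA"
ECORI_SITE = "GAATTC"

junction = xba_filled_end + eco_filled_end

print(junction)
print("XbaI site at", junction.find(XBAI_SITE))
print("EcoRI site at", junction.find(ECORI_SITE))
```

The two sites overlap by the shared GA at the junction, and the ATG initiation codon follows immediately downstream of the EcoRI site.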


Construction of pBGH 33-1 (Figure 5)

The structure of pHGH207-1 which has the entire human growth
hormone gene sequence is shown. The front part of this gene is
synthetic as is described by Goeddel et al., Nature 281, 544
(1979). In the following a plasmid was constructed containing the
BGH gene in the same orientation and in the same position with
respect to the trp-promoter as is the HGH gene in pHGH 207-1.

Twenty µl (i.e. 10 µg) of the plasmid DNA was digested with Bam HI
and PvuII as follows: To the twenty µl of DNA we added 5 µl 10X
restriction enzyme buffer (500 mM NaCl, 100 mM Tris HCl pH 7.4, 100
mM MgSO4 and 10 mM DTT), 20 µl H2O and 10 units BamHI
restriction enzyme and 2 µl PvuII restriction enzyme.
Subsequently, this reaction mixture was incubated at 37°C for 90
minutes. The mixture was loaded on a 6 percent acrylamide gel and
electrophoresis was carried out for 2 hours at 50 mA. The DNA in
the gel was stained with ethidium bromide and visualized with
UV-light. The band corresponding to the 365 bp (with reference to
a HaeIII digest of pBR322) fragment was excised and inserted in a
dialysis bag and the DNA was electroeluted using a current of 100
mA. The liquid was removed from the bag and its salt
concentration adjusted to 0.3 M NaCl. Two volumes of ethanol were
added and the DNA precipitated at -70°C. The DNA was spun down in
an Eppendorf centrifuge, washed with 70 percent ethanol and dried
and resuspended in 10 µl TAE (10 mM Tris HCl pH 7.4, 0.1 mM EDTA).
Similarly, the large XbaI Bam HI fragment of pHGH 207-1 and the
XbaI, partial PvuII 570 bp fragment of pBGH33 were isolated.

Two µl of each of the thus isolated DNA fragments were mixed. 1
µl 10 mM ATP and 1 µl 10x ligase buffer (200 mM Tris HCl pH 7.5,
100 mM MgCl2, 20 mM DTT) and 1 µl T4 DNA ligase and 2 µl H2O
were added. Ligation was done over night at 4°C. This mixture
was used to transform competent E. coli K-12 294 cells as
follows: 10 ml L-broth was inoculated with E. coli K-12 294 and
incubated in a shaker bath at 37°C. At an optical density of 0.8
the cells were harvested by spinning in a Sorvall centrifuge for 5
min. at 6000 rpm. The cell pellet was washed/resuspended in 0.15
M NaCl, and again spun. The cells were resuspended in 75 mM
CaCl2, 5 mM MgCl2 and 10 mM Tris HCl pH 7.8 and incubated on
ice for at least 20 min. The cells were spun down for 5 min at
2500 rpm and resuspended in the same buffer. To 250 µl of this
cell suspension each of the ligation mixtures was added and
incubated for 60 min on ice. The cells were heat shocked for 90
seconds at 42°C, chilled and 2 ml L-broth was added. The cells
were allowed to recover by incubation at 37°C for 1 hour. 100 µl
of this cell suspension was plated on appropriate plates which
were subsequently incubated over night at 37°C. The plasmid
structure in several of the colonies thus obtained is shown in
Figure 5 (pBGH 33-1).

All further constructions were done using the same procedures, as
described above, mutatis mutandis.

Construction of the hybrid growth hormone genes HBGH and BHGH
(Figure 5)

The two PvuII sites in the HGH and BGH genes are at identical
positions. Exchange of PvuII fragments is possible without
changing the reading frame of the messenger RNA of these genes.
The large difference in expression of both genes is due to
differences in initiation of protein synthesis at the beginning of
the messages. Therefore, the front parts of both genes were
exchanged, thus constructing hybrid genes that upon transcription
would give hybrid messenger RNAs. The two plasmids, pBHGH and
pHBGH, were constructed as follows:

From pHGH207-1 there were isolated the large BamHI-XbaI fragment
and the 857 bp BamHI (partial) PvuII fragment containing the HGH
gene without its front part. From pBGH33-1 there was isolated the
75 bp XbaI-PvuII fragment that contains the front part of the BGH
gene. After ligation and transformation pBHGH was obtained.
pHBGH was constructed in a similar way to pBHGH; in this case the
back part was derived from pBGH33-1 whereas the front part, the 75
bp XbaI-PvuII fragment, was derived from pHGH207-1.
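
The exchange of front parts can be pictured as a string operation: each gene is split at its first PvuII site, and the front of one is joined to the back of the other. Because the sites lie at identical positions, the reading frame is preserved. The Python sketch below illustrates this with two short invented sequences standing in for the BGH and HGH genes; the sequences, names and helper function are hypothetical and are not the actual gene sequences.

```python
PVUII_SITE = "CAGCTG"  # PvuII cuts in the middle, CAG^CTG, leaving blunt ends


def split_at_first_pvuii(gene):
    """Split a coding sequence at its first PvuII site into (front, back)."""
    index = gene.find(PVUII_SITE)
    if index < 0:
        raise ValueError("no PvuII site found")
    cut = index + 3
    return gene[:cut], gene[cut:]


# Invented stand-ins with a PvuII site at the same position in both.
bgh_like = "ATGTTCCCAGCTCAGCTGGCTGACTAG"
hgh_like = "ATGTTCCCAACTCAGCTGTTTGACTAG"

bgh_front, bgh_back = split_at_first_pvuii(bgh_like)
hgh_front, hgh_back = split_at_first_pvuii(hgh_like)

bh_hybrid = bgh_front + hgh_back  # bovine front, human back
hb_hybrid = hgh_front + bgh_back  # human front, bovine back

# Equal front lengths mean the downstream codons stay in frame.
frame_preserved = len(bgh_front) % 3 == len(hgh_front) % 3

print(bh_hybrid, hb_hybrid, frame_preserved)
```

Both hybrids regenerate the PvuII site at the junction, so the exchange can in principle be repeated or reversed.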
Design and cloning of the synthetic front part of the BGH gene
(Figure 6)

The DNA sequence up to the PvuII site of the BGH and HGH genes
codes for 22 amino acids. Since the front part of the HGH gene
had excellent protein synthesis initiation properties, the
sequence of the front part of BGH was designed such that the
number of nucleotide changes in the BGH gene would be minimal with
respect to the HGH gene. Only 14 base pair changes from the
natural BGH sequence were made in order to code for the proper BGH
amino acid sequence and reduce conformational structure in the
prospective mRNA. The DNA sequence is shown in Figure 6. The
sequence ends with EcoRI and HindIII sticky ends to make cloning
in a vector easy. Close to the HindIII site is a PvuII site for
the proper junction with the remaining part of the BGH gene.

The fragments U1 to U6 and L1 to L6 were synthesized chemically
according to the procedures described above. All the fragments
except U1 and L6 were mixed and kinased. After addition of U1 and
L6 the mixed fragments were ligated, purified on a 6 percent
polyacrylamide gel and the 75 bp band extracted and isolated
according to standard procedures. This fragment was inserted into
pBR322 that had been cut with Eco RI and Hind III. Thus plasmid
pBR322-01 was obtained.
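
The freedom available for such a design comes from the degeneracy of the genetic code. As a rough illustration, the Python sketch below counts, for each of the first 24 residues of the hybrid gene, how many codons encode it, and multiplies these counts to give the number of nucleotide sequences that would encode the same amino acids. The one-letter residue string is transcribed from the sequence given earlier in this specification; the names are hypothetical, and the count ignores all design constraints (such as retaining the PvuII site or avoiding stable stems), so it is an upper bound only.

```python
from itertools import product
from math import prod

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)

# First 24 residues (met phe pro ala met ser leu ser gly leu phe ala
# asn ala val leu arg ala gln his leu his gln leu), one-letter code.
FRONT_PART = "MFPAMSLSGLFANAVLRAQHLHQL"

choices_per_position = [len(SYNONYMS[residue]) for residue in FRONT_PART]
encoding_count = prod(choices_per_position)

print(choices_per_position)
print(f"{encoding_count:.3e} nucleotide sequences encode these 24 residues")
```

Candidate sequences drawn from this space would then be screened for predicted conformational structure, which is the step this sketch leaves out.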

25 

Replacement of the natural front part of the BGH gone by the 
synthetic front part. (Figure 7) 


From pBR322-01 the cloned synthetic front of the BGH gene was
excised with EcoRI and PvuII, and the resulting 70 bp fragment was
isolated. From pBGH33-1 the large EcoRI-BamHI fragment and the
875 bp BamHI (partial) PvuII fragment were isolated. The three
fragments were isolated and ligated and used to transform
E. coli K-12 294 as described before. Thus, pBGH33-2 was
obtained. This plasmid contains the entire BGH gene but does
not have a promoter. Therefore, pBGH33-2 was cut with EcoRI
and the trp-promoter containing 310 bp EcoRI fragment derived
from pHGH207-1 was inserted by ligation. After transformation
tetracycline resistant colonies were analysed. Therefore,
those colonies had the inserted trp-promoter oriented towards
the BGH- and tet-gene as shown in the figure.
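
Each digest in this construction depends on where the recognition
sequence of the enzyme occurs in the DNA being cut. As a minimal sketch
of that bookkeeping (an illustration, not part of the procedure
described here), the Python below locates recognition sites in a
sequence. The recognition sequences are the standard ones for the
enzymes named in the text; the 25-base test sequence is a hypothetical
toy example, not a plasmid or gene sequence from this work. Only the top
strand is scanned, which is sufficient because all five sites are
palindromic.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "PvuII": "CAGCTG",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
}


def find_sites(dna, enzyme):
    """Return the 0-based start position of every site for `enzyme`."""
    site = SITES[enzyme]
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)  # step by one to allow overlaps
    return positions


def is_unique(dna, enzyme):
    """True when the enzyme has exactly one recognition site in `dna`."""
    return len(find_sites(dna, enzyme)) == 1


# Hypothetical toy insert (assumption for illustration only):
# an EcoRI site, a short coding stretch, a PvuII site, then a HindIII site.
toy = "GAATTCATGGCTCAGCTGTAAGCTT"

for name in SITES:
    print(name, find_sites(toy, name))
```

A check such as `is_unique(toy, "HindIII")` is the kind of test one
would run before relying on a single cut at that site.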


Repair of the 3'-end of the BGH gene (Figure 8)

The sequences beyond the second PvuII site of the BGH gene are
derived from the HGH gene. One of the amino acids at the end
is different from that in the natural BGH gene. This 3'-end
was repaired as follows. A synthetic DNA fragment as shown was
synthesized. It is flanked by an EcoRI and a HindIII end to
facilitate cloning and contains a PvuII site and 3 amino acid
codons and a stop codon in the reading frame of the BGH gene
itself. This fragment was inserted into EcoRI-HindIII opened
pBR322. Thus pBR322-02 was obtained. Subsequently this
plasmid was cut with PvuII and BamHI and the 360 bp fragment
was isolated. From pBGH33-3, which has the entire BGH gene
with the synthetic front part, the large BamHI and XbaI
fragment and the 570 bp XbaI (partial) PvuII fragment were
isolated. These three fragments were ligated and used to
transform cells. Thus, pBGH33-4 was obtained. In this plasmid
a unique HindIII site is present between the stop codon of the
BGH gene and the start codon of the tet-mRNA. Both genes are
transcribed under direction of the trp promoter.

A typical growth medium used to derepress and produce high
levels of BGH per liter (Figure 9) contains: 5.0 g
(NH4)2SO4, 6.0 g K2HPO4, 3.0 g NaH2PO4.2H2O,
1.0 g sodium citrate, 2.5 g glucose, 5 mg tetracycline, 70 mg
thiamine HCl, and 60 g MgSO4.7H2O.

While the present invention has been described, in its
preferred embodiments, with reference to the use of E. coli
transformants, it will be appreciated that other microorganisms
can be employed mutatis mutandis. Examples of such are other
E. coli organisms, e.g. E. coli B, E. coli W3110 ATCC No.
31622 (F-, gal-, prototroph), E. coli χ 1776, ATCC No.
31537, E. coli D1210, E. coli RV308, ATCC No. 31608, etc.,
Bacillus subtilis strains, Pseudomonas strains, etc. and
various yeasts, e.g., Saccharomyces cerevisiae, many of which
are deposited and (potentially) available from recognized
depository institutions, e.g., ATCC. Following the practice of
this invention and the final expression of intended polypeptide
product, extraction and purification techniques may be those
customarily employed in this art, known per se.

1. A method of constructing a DNA sequence for a
messenger RNA encoding a polypeptide comprising a
functional protein or a bioactive portion thereof, said
DNA sequence being designed for insertion together with
appropriately positioned translational start and stop
signals into a microbial expression vector under the
control of a microbially operable promoter, the method
comprising:

a) providing a fragment of said DNA sequence
encoding a C-terminal portion of said polypeptide, said
fragment encoding a polypeptide conforming in sequence to
the natural sequence of this polypeptide,

b) providing a fragment of said DNA sequence
encoding the N-terminal portion of said polypeptide, and

c) ligating the fragments of steps a) and b) in
proper reading frame relation to one another,

said fragment of step b) being characterized in that
the nucleotides thereof are sequentially arranged so as to
provide, on transcription, corresponding messenger RNA
that

1) properly encodes the respective portion of the
amino acid sequence of said polypeptide and

2) demonstrates levels of conformational structure
insufficient to interfere with its accessibility for
efficient ribosomal translation,

said DNA sequence being exclusive of the hybrid DNA
sequence of human growth hormone set forth in Table I of
the specification hereof.

2. The method according to Claim 1 wherein the first
nucleotide of said fragment of step b) corresponds to
nucleotide +1 of the corresponding messenger RNA.

3. The method according to Claim 1 wherein the first
nucleotide of said fragment of step b) corresponds to a
nucleotide of the translational start signal.

4. The method according to Claim 1 wherein said fragment of
step b) extends from the nucleotide corresponding to a
nucleotide of the translational start signal through at
least the nucleotide representing the last of about the
proximal 25 amino acids of said polypeptide.

5. The method according to Claim 1 wherein said fragment of
step b) extends from nucleotide +1 to about nucleotide
+100 of the corresponding messenger RNA.

6. The method according to Claim 1 wherein the DNA sequence 
of said fragment of step b) is as depicted in Figure 1 as 
BGH synthetic. 

7. The method according to any preceding claim wherein said DNA
sequence is inserted together with appropriately positioned
translational start and stop signals into a microbial
expression vector and is therein brought under the control
of a microbially operable promoter, to provide the
corresponding microbial expression vehicle.

8. The method according to Claim 7 wherein a microorganism is
transformed with said microbial expression vehicle to
provide the corresponding transformed microorganism.

9. The method according to Claim 8 wherein the resulting
transformed microorganism is grown under suitable
fermentation conditions and caused to produce said
polypeptide, said polypeptide being subsequently recovered
from the fermentation medium.

10. The method of Claims 8 or 9 wherein said microorganism is
an E. coli strain and said expression vector is an E. coli
plasmid.

11. A plasmid selected from pBR 322-01, pBGH 33-3 and pBGH 33-4.

12. A transformant microorganism harboring one of the plasmids
according to Claim 11.

13. A culture of a microorganism according to Claim 12.

A composition of matter comprising bovine growth hormone
essentially free of other proteins of bovine origin.

14. A microorganism capable of producing bovine growth hormone
in amounts exceeding about 100 thousand copies per cell.

[Figure text recovered from the scan; the drawings themselves are not
reproduced.
Secondary-structure table: entries for synthetic BGH and for bovine
pituitary BGH, with columns including length and kcal/mol. The surviving
values read 11.80, 14, 79 and 4.00, 16, 37; their assignment to rows is
not recoverable.
Synthetic fragment, with EcoRI and HindIII ends:
  5'-AATTCAGCTGCGCATTCTAGA-3'
      3'-GTCGACGCGTAAGATCTTCGA-5'
Construction scheme: EcoRI digest, isolate large fragment; T4 DNA
ligase, transform E. coli 294, screen colonies; BamHI, XbaI digest,
isolate large fragment; XbaI, partial PvuII digest, isolate 570 bp
fragment; PvuII, BamHI digest, isolate 345 bp fragment; T4 DNA ligase,
transform E. coli 294, screen for colonies.]